Cells use surface receptors to estimate the concentration of external ligands. Limits on the accuracy
of such estimations have been well studied for pairs of ligand and receptor species. However, the
environment typically contains many ligands, which can bind to the same receptors with different 
affinities, resulting in cross-talk. In traditional rate models, such cross-talk prevents accurate
inference of individual ligand concentrations. In contrast, here we show that knowing the precise timing
sequence of stochastic binding and unbinding events allows one receptor to provide information
about multiple ligands simultaneously and with high accuracy. We argue that such high-accuracy
estimation of multiple concentrations can be realized by the familiar kinetic proofreading mechanism. 


Introduction: Cells obtain information about their environment by capturing ligand molecules with
receptors on their surface and estimating the ligand concentration from the receptor activity. Limits
on the accuracy of such estimation have been a subject of interest since the seminal work of Berg and
Purcell [1], with several substantial extensions found recently. All of these assume one ligand species
coupled to one receptor species. However, cells carry many types of receptors and have many species of
ligands around them. The same ligand can bind to many receptors, albeit with different affinities, and
vice versa. This is commonly referred to as cross-talk.

In traditional deterministic chemical kinetics, one cannot estimate concentrations of more ligands
than there are receptor types. Further, even weak cross-talk prevents determination of the
concentrations of individual chemical species, since the activity of a receptor is a function of a
weighted sum of the concentrations of all ligands that can bind to it. In contrast, here we argue that,
with cross-talk, the concentrations of more than one chemical species can be inferred from the activity
of one receptor, provided that the entire stochastic temporal sequence of receptor binding and
unbinding events is accessible instead of only its mean occupancy. This surprising result can be
understood by noting that the typical time a ligand remains bound to the receptor depends on its
unbinding rate. Thus observing the statistics of the receptor's unbound time durations allows
estimation of a weighted average of the concentrations of all chemical species that interact with it
[5], and then observing the statistics of the bound time durations reveals how common each ligand is.

In this article, we derive these results for the simplest problem of the class, namely one receptor
interacting with two ligand species. While the exact solution of the inference problem for finding both
ligand concentrations is hard to implement using common biochemical machinery, we show that an accurate
approximation is possible using the familiar kinetic proofreading mechanism.

The Model: Consider a single receptor estimating concentrations of a cognate and a non-cognate
ligand, Fig. 1. The ligands bind to the receptor with on-rates k_c and k_nc. These are proportional to
the ligand concentrations with known coefficients of proportionality. Thus estimating k_{c,nc} is
equivalent to estimating the concentrations themselves. The unbinding rates, or off-rates, r_c and r_nc,
distinguish the two ligands: r_nc > r_c, so that a cognate molecule typically stays bound for longer.
Following Ref. [5], we estimate k_c and k_nc from the time series of binding, {t_i^+}, and unbinding,
{t_i^-}, events of a total duration T using Maximum Likelihood techniques. The numbers of binding and
unbinding events differ by at most one, which is insignificant since we consider T → ∞. Thus, without
loss of generality, we assume that the first event was a binding event at t_1^+, and the last one was
the unbinding at t_n^-. We write the probability distribution of observing the sequence
{t_1^+, t_1^-, ..., t_n^+, t_n^-}, or alternatively the sequence of bound and unbound intervals
τ_i^b = t_i^- − t_i^+ and τ_i^u = t_{i+1}^+ − t_i^-:

    P = (1/Z) Π_{i=1}^{n} e^{−(k_c + k_nc) τ_i^u} [k_c r_c e^{−r_c τ_i^b} + k_nc r_nc e^{−r_nc τ_i^b}].   (1)

Here the first term under the product sign is the probability of the receptor staying unbound for
τ_i^u. The second term, which we from now on denote by D(k_c, k_nc, τ_i^b), is proportional to the
probability of staying bound for τ_i^b, which has contributions from being bound to the cognate and
the noncognate ligands, with odds of k_c/k_nc. Finally, Z is the normalization. Note that here we
define τ_n^u = t_1^+ + (T − t_n^-), so that the n'th unbound interval includes the "incomplete"
unbound intervals before the first binding and after the last unbinding.

FIG. 1: The model. (a) Two ligands, cognate and non-cognate, bind to a receptor R with binding rates
k_c and k_nc, respectively. The cognate unbinding rate is defined as lower than the non-cognate one
(r_c < r_nc). (b) The time series of receptor occupancy is used to determine both on-rates.
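
As a minimal sketch of the model described above, the receptor's binding record can be sampled and scored in a few lines of Python. The names `simulate_intervals` and `log_likelihood` are illustrative and do not come from the text. The sampler assumes that unbound times are exponential with rate k_c + k_nc and that bound times are exponential with the off-rate of whichever ligand bound. The score assumes the bound-interval term has the form D = k_c r_c e^{−r_c τ^b} + k_nc r_nc e^{−r_nc τ^b} and drops the constant log Z.

```python
import math
import random


def simulate_intervals(k_c, k_nc, r_c, r_nc, n, rng=None):
    """Sample n (unbound, bound) interval pairs for one receptor and two ligands."""
    rng = rng or random.Random()
    tau_u, tau_b = [], []
    for _ in range(n):
        # The receptor stays unbound until either ligand arrives.
        tau_u.append(rng.expovariate(k_c + k_nc))
        # The arriving ligand is cognate with probability k_c / (k_c + k_nc).
        is_cognate = rng.random() < k_c / (k_c + k_nc)
        # The bound time is exponential with the off-rate of the bound ligand.
        tau_b.append(rng.expovariate(r_c if is_cognate else r_nc))
    return tau_u, tau_b


def log_likelihood(k_c, k_nc, r_c, r_nc, tau_u, tau_b):
    """Unnormalized log-likelihood of the on-rates given the interval lists."""
    total = 0.0
    for t_u, t_b in zip(tau_u, tau_b):
        # Log-probability of staying unbound for t_u.
        total -= (k_c + k_nc) * t_u
        # Mixture of cognate and noncognate dwell-time densities (assumed form of D).
        d = k_c * r_c * math.exp(-r_c * t_b) + k_nc * r_nc * math.exp(-r_nc * t_b)
        total += math.log(d)
    return total
```

Passing a seeded `random.Random` instance as `rng` makes a simulated record reproducible, which is convenient when comparing estimators on the same data.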

The log-likelihood of k_{c,nc} is the logarithm of P, Eq. (1). Taking the derivatives of the
log-likelihood with respect to k_c and k_nc and setting them to zero gives the Maximum Likelihood (ML)
equations for the two concentrations. Denoting by T_u = Σ_{i=1}^{n} τ_i^u the total time the receptor
is unbound, these are

    Σ_{i=1}^{n} r_c e^{−r_c τ_i^b} / D(k_c^*, k_nc^*, τ_i^b) − T_u = 0,   (2)
    Σ_{i=1}^{n} r_nc e^{−r_nc τ_i^b} / D(k_c^*, k_nc^*, τ_i^b) − T_u = 0,   (3)

where * denotes the ML solution. Multiplying Eqs. (2, 3) by k_c^* and k_nc^*, respectively, and adding
them gives

    k_c^* + k_nc^* = n/T_u,   (4)

which determines the sum of the two concentrations, showing that the estimates are negatively
correlated. As in Ref. [5], the total on-rate (the weighted average of the external concentrations) is
determined only by the average duration of the unbound interval, (n/T_u)^{-1}, because no binding is
possible when the receptor is already bound.
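
One way to solve the ML problem numerically is sketched below. It is not the authors' procedure. It assumes the likelihood form with D = k_c r_c e^{−r_c τ^b} + k_nc r_nc e^{−r_nc τ^b}, under which the log-likelihood separates into a term in the total on-rate K = k_c + k_nc, maximized at K = n/T_u, and a concave term in the cognate fraction f = k_c/K, which can be maximized by bisection on its derivative. The name `ml_on_rates` and its parameters are illustrative.

```python
import math


def ml_on_rates(T_u, tau_b, r_c, r_nc, iters=60):
    """ML on-rates (k_c, k_nc) from the total unbound time and the bound durations."""
    n = len(tau_b)
    K = n / T_u  # total on-rate, fixed by the unbound time alone
    # q_i = (noncognate dwell density) / (cognate dwell density) for each bound interval
    q = [(r_nc / r_c) * math.exp(-(r_nc - r_c) * t) for t in tau_b]

    def slope(f):
        # Derivative of sum_i log(f + (1 - f) q_i) with respect to f.
        # The sum is concave in f, so the slope decreases monotonically.
        return sum((1.0 - qi) / (f + (1.0 - f) * qi) for qi in q)

    lo, hi = 1e-12, 1.0 - 1e-12
    if slope(lo) <= 0.0:
        f = 0.0  # the data favor no cognate ligand
    elif slope(hi) >= 0.0:
        f = 1.0  # the data favor no noncognate ligand
    else:
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if slope(mid) > 0.0:
                lo = mid
            else:
                hi = mid
        f = 0.5 * (lo + hi)
    return f * K, (1.0 - f) * K
```

Because the two estimates always add up to n/T_u, any fluctuation that raises one of them lowers the other, which is the negative correlation noted above.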

In general, the ML equations cannot be solved analytically, requiring numerical approaches. However,
like all ML estimators, the solutions are unbiased to the leading order in n. The standard errors of the
ML estimates can be obtained by inverting the Hessian matrix of the log-likelihood,
∂² log P/∂k_• ∂k_•', where each • stands for c or nc. The inverse of −∂² log P/∂k_• ∂k_•', which scales
∝ 1/n, sets the minimum variance of any unbiased estimator according to the Cramér-Rao bound. It has
straightforward analytical approximations in various regimes. For example, for k_c/k_nc ≫ 1 and
r_c/r_nc ≪ 1, when the noncognate ligand is almost absent, and its few molecules do not bind for long,
one gets σ²(k_c^*) ≈ −(∂² log P/∂k_c²)^{-1}|_{k=k^*} ∼ 1/n, matching the accuracy of sensing one ligand
with one receptor [5]. A regime relevant for detection of a rare, but highly specific ligand [11, 12]
can be investigated as well. Instead, we focus on how the receptor estimates (rather than detects)
concentrations of both ligands simultaneously, which requires us to investigate the full range of
on-rates.

To study the variability of the ML estimator, we define its error as
E_{c,nc} = n σ²(k*_{c,nc})/k²_{c,nc}, the squared coefficient of variation multiplied by n, which has a
finite limit at n → ∞. E = 1 corresponds to the accuracy that a receptor measuring a single ligand
would obtain [5]. We show log_10 E for different on- and off-rates in Fig. 2. If the two ligands are
readily distinguishable, r_c ≪ r_nc, then the ligand with the dominant k has E ≈ 1. When k_c ≈ k_nc,
E_• ≈ 4...5, and it grows to 10...30 for a ligand with a very small relative on-rate. Emphasizing the
importance of the time scale separation, E > 100 if the ligands are hard to distinguish, r_c ≈ r_nc.
Here, in addition, the correlation coefficient ρ of the two estimates reaches −1 because the same
binding event can be attributed to either ligand. Finally, the plots are asymmetric with respect to the
exchange of k_c and k_nc because the cognate ligand can generate short binding events, while long events
from the noncognate ligand are exponentially unlikely. In summary, it is possible to infer two ligand
concentrations from one receptor, with an error only 1...10 times larger than for ligand-receptor pairs
with no cross-talk, as long as the two off-rates are substantially different.
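
The large-n error E can also be estimated without simulating sequences, by integrating the expected (Fisher) information per binding event. The sketch below swaps this in for the simulation-based averages and should be read as an illustration under assumptions, not as the procedure behind the reported numbers. It assumes the bound-interval term D = k_c r_c e^{−r_c τ} + k_nc r_nc e^{−r_nc τ}, so that the information matrix per event is I_{ij} = (1/K) ∫ a_i a_j / D dτ with a_i = r_i e^{−r_i τ} and K = k_c + k_nc. The function name `fisher_error` and the trapezoid-rule settings are hypothetical.

```python
import math


def fisher_error(k_c, k_nc, r_c, r_nc, steps=200_000):
    """Return (E_c, E_nc, rho) from the per-event Fisher information matrix."""
    K = k_c + k_nc
    tau_max = 40.0 / min(r_c, r_nc)  # integrate far into the slower exponential tail
    h = tau_max / steps
    I_cc = I_cn = I_nn = 0.0
    for j in range(steps + 1):
        tau = j * h
        a_c = r_c * math.exp(-r_c * tau)
        a_n = r_nc * math.exp(-r_nc * tau)
        D = k_c * a_c + k_nc * a_n
        # Trapezoid weights: half weight at the two end points.
        w = h * (0.5 if j in (0, steps) else 1.0) / K
        I_cc += w * a_c * a_c / D
        I_cn += w * a_c * a_n / D
        I_nn += w * a_n * a_n / D
    det = I_cc * I_nn - I_cn ** 2
    # Covariance of the estimates is the inverse information matrix divided by n,
    # so n * variance / k^2 is the diagonal of the inverse divided by k^2.
    E_c = (I_nn / det) / k_c ** 2
    E_nc = (I_cc / det) / k_nc ** 2
    rho = -I_cn / math.sqrt(I_cc * I_nn)
    return E_c, E_nc, rho
```

In this sketch the correlation coefficient is negative for any choice of rates, since the off-diagonal information element is a sum of positive terms.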

FIG. 2: Variability of the ML estimators, represented by log_10 E_c (left), log_10 E_nc (center), and
the correlation coefficient ρ between k_c^* and k_nc^* (right), as functions of the on- and off-rates.
Here we use r_nc = k_c + k_nc = 1. The plotted quantities are estimated as averages over 30,000 randomly
generated binding/unbinding sequences for each combination of the rates. Each sequence consists of
n = 30,000 binding events, simulated using the Gillespie algorithm. Standard errors are too small to be
represented.

Approximate solution. Solving Eqs. (2, 3) to find the ML on-rates would be hard for the cell. Luckily,
an approximate solution exists. To find it, we notice that most of the long binding events come from
the cognate ligand since the noncognate one dissociates faster. Defining long events as τ^b > T_c, we
rewrite Eqs. (2, 4) as

Assuming that almost all long events are cognate, T_c ≫ 1/r_nc, this gives

where n_l is the number of long events, and the superscript "a" stands for the approximate solution.
If, further, T is long enough so that there are many short events, and a single binding duration hardly
affects k^*, then the sum over short events can be approximated by its expectation value:

where P(τ^b | k_c^*, k_nc^*) is the probability of observing a binding event of the duration τ^b for the
given binding rates,

Plugging this into Eq. (8), we obtain

Finally, since n_l ≪ n, using Eq. (4) we get:

    k_c^a ≈ (n_l/T_u) e^{r_c T_c},   (11)
    k_nc^a ≈ n/T_u − k_c^a.   (12)

In other words, the approximate cognate ligand concentration is proportional to the number of long
events.
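
The approximate estimator amounts to counting long events, as the following sketch illustrates. It assumes the final relations take the form k_c^a ≈ (n_l/T_u) e^{r_c T_c} and k_nc^a ≈ n/T_u − k_c^a, which is what the proportionality to the number of long events and the sum rule for the total on-rate suggest. The function name `approx_on_rates` is illustrative.

```python
import math


def approx_on_rates(T_u, tau_b, r_c, T_c):
    """Approximate on-rates from the count of bound intervals longer than T_c."""
    n = len(tau_b)
    n_long = sum(1 for t in tau_b if t > T_c)
    # A cognate binding outlasts T_c with probability exp(-r_c * T_c),
    # so the long-event count is rescaled by the inverse of that probability.
    k_c_a = n_long * math.exp(r_c * T_c) / T_u
    # The total on-rate n / T_u is shared between the two ligands.
    k_nc_a = n / T_u - k_c_a
    return k_c_a, k_nc_a
```

Note that nothing in this rule prevents k_nc^a from coming out negative for a short record, where a few long events are rescaled by a large factor.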

We can estimate the bias and the variance of k^a_{c,nc} in a limiting case. If r_c and r_nc are not
very different from each other, then T_c must be much larger than the inverse of either of them,
T_c ≫ {r_c^{-1}, r_nc^{-1}}, and n_l ≪ n. Then most of the variance of k^a_{c,nc} in Eqs. (11, 12)
comes from the variability of n_l, but not T_u. Thus we write ⟨k_c^a⟩ ≈ (⟨n_l⟩/⟨T_u⟩) e^{r_c T_c}.
Further, the individual unbound periods are independent, so that ⟨T_u⟩ = n⟨τ^u⟩ = n/(k_c + k_nc)
(notice the use of k rather than k^a in this relation). Further,

    ⟨n_l⟩ = n (k_c e^{−r_c T_c} + k_nc e^{−r_nc T_c})/(k_c + k_nc).

Combining these expressions, we get

    ⟨k_c^a⟩ ≈ k_c + k_nc e^{−(r_nc − r_c) T_c}.   (13)

Thus for large T_c, the bias of the approximate estimator, k_nc e^{−(r_nc − r_c) T_c}, grows with the
relative number of noncognate long binding events. In turn, the latter is proportional to k_nc, but
decreases exponentially with T_c.

Within the same approximation, the variance of the estimator is
σ²(k_c^a) ≈ [σ²(n_l)/⟨T_u⟩²] e^{2 r_c T_c}. But long binding events are rare, independent of each other,
and hence obey Poisson statistics. Thus σ²(n_l) = ⟨n_l⟩, so that

    σ²(k_c^a) ≈ (⟨k_c^a⟩/⟨T_u⟩) e^{r_c T_c}.   (14)

The variance grows with T_c.

Knowing that the bias and the variance of the approximation change in opposite directions with T_c,
we can find the optimal cutoff by minimizing the overall error, or, in other words, solving the
bias-variance tradeoff:

    T_c^* = arg min_{T_c} L = arg min_{T_c} [(⟨k_c^a⟩ − k_c)² + σ²(k_c^a)],   (15)

where L is the sum of the squared bias and the variance of the estimator. Near the optimal cutoff, the
bias is small, and we use k_c instead of ⟨k_c^a⟩ for the variance of the estimator, Eq. (14). Then
solving Eq. (15) gives:

    T_c^* = [1/(2 r_nc − r_c)] log[2 (r_nc − r_c) k_nc² ⟨T_u⟩/(k_c r_c)].   (16)

Plugging this into Eqs. (13, 14), we can get the minimal error of the estimator, which we omit here
for brevity.
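
The bias-variance tradeoff can be checked numerically with a short sketch. It assumes a bias of k_nc e^{−(r_nc − r_c) T_c} and a variance of (k_c/T_u) e^{r_c T_c}, the latter obtained by using k_c in place of the mean estimate near the optimum. Setting dL/dT_c = 0 under these assumptions gives the closed form coded in `optimal_cutoff`; both function names are illustrative, and the closed form is a reconstruction from the stated assumptions rather than a quoted result.

```python
import math


def total_error(T_c, k_c, k_nc, r_c, r_nc, T_u):
    """Squared bias plus variance of the approximate cognate estimate."""
    bias = k_nc * math.exp(-(r_nc - r_c) * T_c)
    variance = k_c * math.exp(r_c * T_c) / T_u
    return bias ** 2 + variance


def optimal_cutoff(k_c, k_nc, r_c, r_nc, T_u):
    """Cutoff that zeroes the derivative of total_error with respect to T_c."""
    # Requires r_nc > r_c; the argument of the log must exceed 1 for T_c > 0.
    arg = 2.0 * (r_nc - r_c) * k_nc ** 2 * T_u / (k_c * r_c)
    return math.log(arg) / (2.0 * r_nc - r_c)
```

For example, with k_c = k_nc = 0.5, r_c = 0.1, r_nc = 1 and T_u = 10^4, the cutoff from this expression is close to 6 in units of 1/r_nc, and multiplying T_u by ten moves it by only log(10)/1.9, which illustrates the weak, logarithmic dependence on the measurement duration.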

The optimal cutoff is ∝ 1/r_nc if r_nc ≫ r_c, and it grows with r_c, allowing for better
disambiguation of cognate and noncognate events. Crucially, the off-rates are specified with the ligand
identities. In contrast, the on-rates, k_{c,nc}, are what the receptor measures. Therefore, it is
encouraging that T_c^* depends only logarithmically on the on-rates (and also on the duration of the
measurement, T_u): fixing T_c as T_c^* at some fixed values of k_{c,nc} remains near-optimal for a broad
range of on-rates. To illustrate this, we use T_c = T_c^*(k_c = k_nc = 1/2) = T_0 and analyze the
quality of the approximation in Fig. 3, where we plot the ratio L_{c,nc}(T_0)/σ²(k*_{c,nc}). Since the
ratio approaches 1 when r_c/r_nc → 0 (specifically, for r_c/r_nc = 0.1, L_c(T_0)/σ²(k_c^*) ≈ 1.47 and
L_nc(T_0)/σ²(k_nc^*) ≈ 1.21), we conclude that the approximation is accurate even at fixed T_c = T_0
when its assumptions are satisfied. In contrast, when the ligands are nearly indistinguishable,
L_{c,nc}(T_0)/σ²(k*_{c,nc}) ∼ 100, but here one would not use one receptor to estimate two
concentrations since even the ML solution is bad (cf. Fig. 2). Note also that both L_c and L_nc are
smaller for r_c ≈ r_nc if k_c ≫ k_nc. This is because our main assumption (that almost all long events
are cognate) holds better when cognate ligands dominate. Finally, the correlation coefficient between
the approximate estimates, ρ^a (right panel), reaches −1 earlier than in Fig. 2. This is a direct
consequence of Eqs. (11, 12).

FIG. 3: Comparison of errors of the approximate and the ML solutions. We plot
log_10 [L_c(T_0)/σ²(k_c^*)] (left), log_10 [L_nc(T_0)/σ²(k_nc^*)] (center), and the covariance of the
approximate estimates (right) as functions of on- and off-rates. Simulations are performed in the same
way as in Fig. 2.

Kinetic Proofreading for approximate estimation. The approximate solution can be computed by cells
using the well-known kinetic proofreading (KPR) mechanism. In the simplest model of KPR, intermediate
states between an inactive and an active state of a receptor delay the activation. Thus bound ligands
can dissociate before the receptor activates, at which point it quickly reverts to the inactive state.
Since r_c < r_nc, cognate ligands dominate among bindings that actually lead to activation. The
resulting increase in specificity in various KPR schemes has led to their exploration in the context of
detection of rare ligands, and here we extend them to measurement of concentrations of cognate and
noncognate ligands simultaneously.

FIG. 4: Kinetic Proofreading for estimating multiple concentrations. Molecules A and B are produced
when the receptor is bound, but B is produced only for long bindings. Another chemical species C
subtracts B from A, so that B approximates k_c and C approximates k_nc.

Consider the biochemical network in Fig. 4: the receptor (R) activates two messenger molecules, A and
B. The first one is activated with the rate k_A whenever the receptor is bound. The second one is
activated only if the receptor stays bound for longer than a certain T_c (with the delay achieved using
the KPR intermediate states). The activation rate after the delay is k_B. The molecules deactivate with
the rates r_A and r_B, respectively, and all activations/deactivations are first-order reactions. Then
the mean concentrations of the messenger molecules are:

    A = [(k_c/r_c + k_nc/r_nc)/(1 + k_c/r_c + k_nc/r_nc)] (k_A/r_A),   (17)
    B = {[(k_c/r_c) e^{−r_c T_c} + (k_nc/r_nc) e^{−r_nc T_c}]/(1 + k_c/r_c + k_nc/r_nc)} (k_B/r_B).   (18)

Assuming again that most bindings longer than T_c are cognate, we solve Eqs. (17, 18) for the on-rates

The corrections of the form A/(k_A/r_A − A) appear because bindings only happen to unbound receptors,
as emphasized in Ref. [5]. However, these nonlinear relations are still hard to implement with simple
biochemical components. We solve this by further assuming ε = A/(k_A/r_A) ≪ 1, which is true if the
receptor is mostly unbound (both on-rates are small compared to the respective off-rates). This gives

    k_c^kpr ≈ B e^{r_c T_c} r_c r_B/k_B,   (21)
    k_nc^kpr ≈ (A r_A/k_A − B e^{r_c T_c} r_B/k_B) r_nc.   (22)

These equations are analogous to Eqs. (11, 12). They are easy to realize biochemically (cf. Fig. 4):
k_c is related to the concentration of the proofread species B by a rescaling, and k_nc comes from
subtracting rescaled versions of A and B from each other. The subtraction can be done by a third
species C, activated by A and suppressed by B. Since ε ≪ 1, A and B are small, and many such
activation-suppression schemes are linearized as the subtraction [8].
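
A short numerical check of the linearized readout is sketched below. It assumes mean messenger levels of the form A = [x/(1 + x)] k_A/r_A and B = [(k_c/r_c) e^{−r_c T_c} + (k_nc/r_nc) e^{−r_nc T_c}]/(1 + x) · k_B/r_B, with x = k_c/r_c + k_nc/r_nc, and a linear readout in which k_c is a rescaled B and k_nc is a rescaled difference of A and B. The function names and the parameter values in the usage note are illustrative only.

```python
import math


def mean_messengers(k_c, k_nc, r_c, r_nc, T_c, k_A, r_A, k_B, r_B):
    """Mean levels of A (any binding) and B (bindings longer than T_c)."""
    x_c, x_nc = k_c / r_c, k_nc / r_nc
    occupancy_norm = 1.0 + x_c + x_nc
    A = (x_c + x_nc) / occupancy_norm * k_A / r_A
    B = (x_c * math.exp(-r_c * T_c) + x_nc * math.exp(-r_nc * T_c)) / occupancy_norm * k_B / r_B
    return A, B


def kpr_readout(A, B, r_c, r_nc, T_c, k_A, r_A, k_B, r_B):
    """Linearized on-rate estimates, valid when the receptor is mostly unbound."""
    scaled_A = A * r_A / k_A                          # approximates k_c/r_c + k_nc/r_nc
    scaled_B = B * math.exp(r_c * T_c) * r_B / k_B    # approximates k_c/r_c
    k_c_est = scaled_B * r_c
    k_nc_est = (scaled_A - scaled_B) * r_nc
    return k_c_est, k_nc_est
```

With k_c = 0.001, k_nc = 0.01, r_c = 0.1, r_nc = 1 and T_c = 6, the receptor is bound about 2% of the time, and the two readouts land within a few percent of the true on-rates; the residual error comes from the neglected occupancy correction and from noncognate bindings that outlast T_c.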

The bias of k^a_{c,nc} due to long, but noncognate binding events, Eq. (13), carries over to
k^kpr_{c,nc}. However, there is an additional contribution since the time to traverse the intermediate
states is random. Thus T_c has some variance σ²_{T_c}. This variability changes the rate of occurrence
of long binding events, but they are still rare, nearly independent, and Poisson-distributed. Denoting
by ⟨·⟩ the averaging at a fixed T_c, and by an overbar the averaging over T_c, we get

Thus σ²_{T_c} effectively renormalizes the cutoff to T_c − (1/2) r_c σ²_{T_c}, which is independent of
the on-rates. Replacing T_c in Eqs. (21, 22) by its renormalized value, which is an easy change in the
scaling factors, removes this additional bias due to the random T_c in the KPR scheme.
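
The effect of a random cutoff can be illustrated with a small Monte Carlo sketch. It assumes, purely for illustration, that the proofreading delay is Gaussian with mean T_mean and standard deviation T_sd; for that case the average of e^{−r_c T_c} equals e^{−r_c (T_mean − r_c T_sd²/2)}, so the cutoff is effectively shortened by r_c T_sd²/2. The function names are hypothetical, and a real multi-step delay would not be exactly Gaussian.

```python
import math
import random


def mean_long_fraction(r_c, T_mean, T_sd, samples=100_000, rng=None):
    """Monte Carlo average of exp(-r_c * T_c) over a Gaussian-distributed cutoff."""
    rng = rng or random.Random()
    total = 0.0
    for _ in range(samples):
        total += math.exp(-r_c * rng.gauss(T_mean, T_sd))
    return total / samples


def renormalized_cutoff(r_c, T_mean, T_sd):
    """Effective fixed cutoff that reproduces the same average long-event fraction."""
    return T_mean - 0.5 * r_c * T_sd ** 2
```

The shift depends only on r_c and the delay variance, not on the on-rates, so in this picture it can be absorbed into the fixed rescaling factors of the readout.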

Since long bindings are rare, the variance of the KPR estimator is again generally dominated by B, but
not A. The intrinsic stochasticity in production of molecules of B contributes to the variance.
However, this contribution can be made arbitrarily small by increasing k_B, and we neglect it here. A
larger contribution comes from the random number of long bound intervals and the random duration of
each of them. To calculate this, in the limit of rare long binding events, we use well-known results in
the theory of noise propagation in chemical networks:

    σ²(B)/B² ≈ (1 + k_c/r_c + k_nc/r_nc) e^{r_c T_c} r_B/k_c.   (24)

This is a direct analog of Eq. (14).

Discussion. The realization in earlier work, including Ref. [5], that the detailed temporal sequence
of binding and unbinding events carries more information about the ligand concentration than the mean
receptor occupancy is a conceptual breakthrough. It parallels the realization in the computational
neuroscience community that precise timing of spikes carries more information about the stimulus than
the mean neural firing rate [19-24], and it has the potential to be equally impactful. This extra
information when measuring one ligand concentration with one receptor [5] amounted to increasing the
sensing accuracy by a constant prefactor, or, equivalently, getting only a finite number of additional
bits from even a very long measurement [25]. In contrast, here we show that two concentrations can be
measured with one receptor with a variance that decreases inversely proportionally to the number of
observations, n, Eq. (14), or to the integration time, 1/r_B, Eq. (24), so that the accuracy is only a
(small) prefactor lower than would be possible with one receptor per ligand species. Asymptotically,
this doubles the information obtained by the receptor [25].

In principle, one can measure more than two concentrations
similarly, as long as all species have sufficiently
distinct off-rates. While the error (the variance for the 
ML estimator, and both the bias and the variance for 
the approximate and the KPR estimators) would grow 
with a larger number of ligand species, this would still 
represent a dramatic increase in the information gained 
by the receptor that keeps track of its precise temporal 
dynamics, rather than just the average binding state. 

Crucially, such improvement would not be possible without the cross-talk, or binding among noncognate
ligands and receptors. Normally, the cross-talk is considered a nuisance that must be suppressed
[26, 27]. Instead, we argue that cross-talk can be beneficial by recruiting more receptor types to
measure the concentration of the same ligand. In particular, this allows having fewer receptor than
ligand species, potentially illuminating how cells function reliably in chemically complex
environments with few receptor types. Further, the cross-talk can increase the dynamic range of the
entire system: a ligand may saturate its cognate receptor, preventing accurate measurement of its
(high) concentration, but it may be in the sensitive range of non-cognate receptors at the same time.
Finally, the increased bandwidth may lead to improvements in sensing a time-dependent ligand
concentration. We will explore such many-to-many sensory schemes, extending ideas of Ref. [28] to
tracking temporal sequences of activation of receptors and to varying environments, in forthcoming
publications.

While the exact maximum likelihood inference of multiple concentrations from a temporal
binding-unbinding sequence is rather complex, we showed that when the cognate and the non-cognate
off-rates are substantially different, there is a simpler, approximate, but accurate inference
procedure. In various immune system problems, r_nc/r_c ∼ 5, which would allow the approximation to
work. Moreover, when the receptor is not saturated and spends most of its time unbound, this inference
can be performed by biochemical motifs readily available to the cell. Namely, one needs two branches of
activation downstream of the receptor, with one of them having a kinetic proofreading (KPR) time delay,
and then an estimate of the difference of activities of the branches. This suggests a possible signal
estimation role for the KPR scheme in addition to the more traditional signal detection one. Such
branching and merging of signaling pathways downstream of a receptor is common in signaling [27, 29].
Thus exploring the function of such complex organization in the context of estimation of multiple
signals with cross-talk is in order.

In summary, monitoring precise temporal sequences of receptor activation/deactivation opens up new and
exciting possibilities for environment sensing by cells.